Multiscale Document Segmentation
Authors
Abstract
In this paper, we propose a new approach to document segmentation which exploits both local texture characteristics and image structure to segment scanned documents into regions such as text, background, headings and images. Our method is based on the use of a multiscale Bayesian framework. This framework is chosen because it allows accurate modeling of both the image characteristics and contextual structure of each region. The parameters which describe the characteristics of typical images are extracted from a database of training images which are produced by scanning typical documents and hand segmenting them into the desired components. This training procedure is based on the expectation maximization (EM) algorithm and results in approximate maximum likelihood (ML) estimates of the model parameters for region textures and contextual structure at various resolutions. Once the training procedure is performed, scanned documents may be segmented using a fine-to-coarse-to-fine procedure that is computationally efficient.
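The fine-to-coarse-to-fine idea can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes per-pixel class log-likelihoods are already available (in the paper these would come from the trained texture models), approximates block likelihoods by summing over 2x2 children, and uses a hypothetical parent-to-child transition prior controlled by `stay_prob`. The function names and the square, power-of-two image size are also assumptions made for illustration.

```python
import numpy as np

def fine_to_coarse(loglik):
    """Sum class log-likelihoods over 2x2 blocks, level by level.

    loglik: array of shape (N, N, K), N a power of two, K >= 2 classes.
    Returns a list of arrays, finest level first, coarsest (1, 1, K) last.
    Summing assumes pixels are independent given the block class
    (a simplification, not the paper's exact model).
    """
    pyramid = [loglik]
    while pyramid[-1].shape[0] > 1:
        a = pyramid[-1]
        h, w, k = a.shape
        pyramid.append(a.reshape(h // 2, 2, w // 2, 2, k).sum(axis=(1, 3)))
    return pyramid

def coarse_to_fine(pyramid, stay_prob=0.9):
    """Label each level from coarsest to finest, using the parent label as context."""
    k = pyramid[0].shape[-1]
    # Hypothetical prior: a child keeps its parent's class with probability
    # stay_prob and switches to each other class with equal probability.
    log_trans = np.full((k, k), np.log((1.0 - stay_prob) / (k - 1)))
    np.fill_diagonal(log_trans, np.log(stay_prob))
    # Coarsest level has no parent: pick the most likely class.
    labels = np.argmax(pyramid[-1], axis=-1)
    for level in reversed(pyramid[:-1]):
        # Each parent label is copied to its four children.
        parent = np.repeat(np.repeat(labels, 2, axis=0), 2, axis=1)
        # log_trans[parent] has shape (h, w, K): log P(child class | parent class).
        labels = np.argmax(level + log_trans[parent], axis=-1)
    return labels

# Toy input: every pixel favors class 0 except one weakly noisy pixel.
loglik = np.zeros((4, 4, 2))
loglik[..., 1] = -5.0
loglik[1, 2] = [-1.0, 0.0]
labels = coarse_to_fine(fine_to_coarse(loglik))
```

In this toy case a pixel-by-pixel maximum likelihood decision would assign the noisy pixel to class 1, while the contextual pass keeps it in class 0 because its parent block strongly favors class 0. Both passes visit each pixel a constant number of times per level, which is one reason a procedure of this shape can be computationally cheap.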
Related resources
Persian Printed Document Analysis and Page Segmentation
This paper presents a hybrid method, combining low-resolution and high-resolution analysis, for Persian page segmentation. In the low-resolution page segmentation, a pyramidal image structure is constructed for multiscale analysis and segments the document image into a set of regions. In the high-resolution page segmentation, connected components analysis segments each region into homogeneous regions and identifyi...
Multiscale document segmentation using wavelet-domain hidden Markov models
We introduce a new document image segmentation algorithm, HMTseg, based on wavelets and the hidden Markov tree (HMT) model. The HMT is a tree-structured probabilistic graph that captures the statistical properties of the coefficients of the wavelet transform. Since the HMT is particularly well suited to images containing singularities (edges and ridges), it provides a good classifier for distingui...
Multiscale Multiphysic Mixed Geomechanical Model for Deformable Porous Media Considering the Effects of Surrounding Area
The porous media of hydrocarbon reservoirs are influenced by several scales. The effective scales of the fluid phases and the solid phase are different. To reduce calculations in simulating porous hydrocarbon reservoirs, each physical phenomenon should be assisted in the range of its effective scale. Simulating a multiphysics hydrocarbon medium at fine scale exceeds the current computational ca...
A FEM Multiscale Homogenization Procedure using Nanoindentation for High Performance Concrete
This paper aims to develop a numerical multiscale homogenization method for prediction of elasto-viscoplastic properties of a high performance concrete (HPC). The homogenization procedure is separated into two levels according to the microstructure of the HPC: the mortar or matrix level and the concrete level. The elasto-viscoplastic behavior of individual microstructural phases of the matrix a...
Dual Irregular Voronoi Pyramids and Segmentation
Dieter Willersinn, Etienne Bertin and Walter Kropatsch. Dept. for Pattern Recognition and Image Processing, Institute for Automation, Technical University of Vienna. Technical report PRIP-TR-27, March 24, 1994. Abstract: We continue previous work about t...
Journal title:
Volume, issue:
Pages: -
Publication date: 1997